proposition ii
Proximal Operators of Sorted Nonconvex Penalties
Gagneux, Anne, Massias, Mathurin, Soubies, Emmanuel
This work studies the problem of sparse signal recovery with automatic grouping of variables. To this end, we investigate sorted nonconvex penalties as a regularization approach for generalized linear models. These penalties are designed to promote clustering of variables due to their sorted nature, while the nonconvexity reduces the shrinkage of coefficients. Our goal is to provide efficient ways to compute their proximal operator, enabling the use of popular proximal algorithms to solve composite optimization problems with this choice of sorted penalties. We distinguish between two classes of problems: the weakly convex case where computing the proximal operator remains a convex problem, and the nonconvex case where computing the proximal operator becomes a challenging nonconvex combinatorial problem. We demonstrate the practical interest of using such penalties on several experiments. R is a data-fidelity term and the penalty Ψ is a regularization term that should embed some properties of the solution. Among them, sparsity and structure are particularly useful for a model as they improve its interpretability and decrease its complexity. Sparsity is most usually enforced through a penalty term favoring variable selection, i.e. solutions that use only a subset of features.
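To make the clustering effect of a sorted penalty concrete, here is a minimal sketch of the proximal operator of the convex sorted L1 (SLOPE) norm, computed by sorting followed by a pool-adjacent-violators pass. This is the classical convex case, swapped in for illustration; it is not the nonconvex procedure studied in the paper. The function name `prox_sorted_l1` and the example weights are assumptions made for this sketch.

```python
import math


def prox_sorted_l1(x, lambdas):
    """Prox of the sorted L1 norm sum_i lambdas[i] * |x|_(i).

    Assumes `lambdas` is nonnegative, nonincreasing, and as long as `x`.
    Illustrative sketch of the convex case only.
    """
    n = len(x)
    # Indices of x ordered by decreasing magnitude.
    order = sorted(range(n), key=lambda i: -abs(x[i]))
    # Shift the sorted magnitudes by the sorted weights.
    z = [abs(x[i]) - lam for i, lam in zip(order, lambdas)]

    # Pool adjacent violators: enforce a nonincreasing fit by averaging
    # neighbouring blocks whose means are out of order.
    blocks = []  # each block is [sum, count]
    for v in z:
        blocks.append([v, 1])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] <= blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c

    # Expand the blocks and clip negative values to zero.
    fitted = []
    for s, c in blocks:
        fitted.extend([max(s / c, 0.0)] * c)

    # Restore the original order and signs.
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = math.copysign(fitted[rank], x[i])
    return out


if __name__ == "__main__":
    print(prox_sorted_l1([3.0, 2.9, -0.5], [1.0, 0.5, 0.1]))
```

In this example the two largest entries, 3.0 and 2.9, are pooled into a single block and come out tied at about 2.2, while the smallest entry is shrunk to about -0.4. That tie is the grouping behaviour a sorted penalty is meant to produce.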
Building Intelligent Databases through Similarity: Interaction of Logical and Qualitative Reasoning
In this article, we present a novel method for assessing the similarity of information within knowledge-bases using a logical point of view. This proposal introduces the concept of a similarity property space $\Xi$P for each knowledge K, offering a nuanced approach to understanding and quantifying similarity. By defining the similarity knowledge space $\Xi$K through its properties and incorporating similarity source information, the framework reinforces the idea that similarity is deeply rooted in the characteristics of the knowledge being compared. Inclusion of super-categories within the similarity knowledge space $\Xi$K allows for a hierarchical organization of knowledge, facilitating more sophisticated analysis and comparison. On the one hand, it provides a structured framework for organizing and understanding similarity. The existence of super-categories within this space further allows for hierarchical organization of knowledge, which can be particularly useful in complex domains. On the other hand, the finite nature of these categories might be restrictive in certain contexts, especially when dealing with evolving or highly nuanced forms of knowledge. Future research and applications of this framework focus on addressing its potential limitations, particularly in handling dynamic and highly specialized knowledge domains.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Ireland > Connaught > County Galway > Galway (0.04)
- Europe > France > Brittany > Finistère > Brest (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.55)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.47)
Instabilities in Convnets for Raw Audio
Haider, Daniel, Lostanlen, Vincent, Ehler, Martin, Balazs, Peter
What makes waveform-based deep learning so hard? Despite numerous attempts at training convolutional neural networks (convnets) for filterbank design, they often fail to outperform hand-crafted baselines. These baselines are linear time-invariant systems: as such, they can be approximated by convnets with wide receptive fields. Yet, in practice, gradient-based optimization leads to suboptimal approximations. In our article, we approach this phenomenon from the perspective of initialization. We present a theory of large deviations for the energy response of FIR filterbanks with random Gaussian weights. We find that deviations worsen for large filters and locally periodic input signals, which are both typical for audio signal processing applications. Numerical simulations align with our theory and suggest that the condition number of a convolutional layer follows a logarithmic scaling law between the number and length of the filters, which is reminiscent of discrete wavelet bases.
- Europe > Austria > Vienna (0.14)
- Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
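The condition number mentioned in the abstract above can be probed numerically. The sketch below is a simplification rather than the authors' experimental setup: it assumes circular convolution with no subsampling, so the energy response of the filterbank is diagonal in the Fourier domain and the condition number is the ratio of its largest to smallest value. All function names and the chosen sizes are assumptions.

```python
import cmath
import math
import random


def energy_response(filters, n):
    """Return S[k] = sum_j |DFT_n(h_j)[k]|^2 for k = 0..n-1.

    Each filter is zero-padded to length n (requires len(h) <= n).
    """
    response = [0.0] * n
    for h in filters:
        for k in range(n):
            # DFT of the zero-padded filter at frequency bin k.
            h_hat = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                        for t, c in enumerate(h))
            response[k] += abs(h_hat) ** 2
    return response


def condition_number(filters, n):
    """Ratio of the upper to the lower frame bound of the filterbank."""
    s = energy_response(filters, n)
    lo, hi = min(s), max(s)
    return hi / lo if lo > 0 else math.inf


def random_filterbank(num_filters, length, rng):
    """Draw i.i.d. Gaussian taps, scaled so the expected energy response is 1."""
    sigma = 1.0 / math.sqrt(num_filters * length)
    return [[rng.gauss(0.0, sigma) for _ in range(length)]
            for _ in range(num_filters)]


if __name__ == "__main__":
    rng = random.Random(0)
    for length in (4, 16, 64):
        kappa = condition_number(random_filterbank(8, length, rng), 128)
        print(f"filters=8 length={length} condition number={kappa:.2f}")
```

A condition number of 1 means the filterbank preserves energy exactly; a larger value means some frequencies are amplified much more than others at initialization.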
Accelerated Algorithms for a Class of Optimization Problems with Equality and Box Constraints
Parashar, Anjali, Srivastava, Priyank, Annaswamy, Anuradha M.
Convex optimization with equality and inequality constraints is a ubiquitous problem in several optimization and control problems in large-scale systems. Recently there has been a lot of interest in establishing accelerated convergence of the loss function. A class of high-order tuners was recently proposed in an effort to lead to accelerated convergence for the case when no constraints are present. In this paper, we propose a new high-order tuner that can accommodate the presence of equality constraints. In order to accommodate the underlying box constraints, time-varying gains are introduced in the high-order tuner which leverage convexity and ensure anytime feasibility of the constraints. Numerical examples are provided to support the theoretical derivations.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Data-Driven Control with Inherent Lyapunov Stability
Min, Youngjae, Richards, Spencer M., Azizan, Navid
Recent advances in learning-based control leverage deep function approximators, such as neural networks, to model the evolution of controlled dynamical systems over time. However, the problem of learning a dynamics model and a stabilizing controller persists, since the synthesis of a stabilizing feedback law for known nonlinear systems is a difficult task, let alone for complex parametric representations that must be fit to data. To this end, we propose Control with Inherent Lyapunov Stability (CoILS), a method for jointly learning parametric representations of a nonlinear dynamics model and a stabilizing controller from data. To do this, our approach simultaneously learns a parametric Lyapunov function which intrinsically constrains the dynamics model to be stabilizable by the learned controller. In addition to the stabilizability of the learned dynamics guaranteed by our novel construction, we show that the learned controller stabilizes the true dynamics under certain assumptions on the fidelity of the learned dynamics. Finally, we demonstrate the efficacy of CoILS on a variety of simulated nonlinear dynamical systems.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Riesz-Quincunx-UNet Variational Auto-Encoder for Satellite Image Denoising
Thai, Duy H., Fei, Xiqi, Le, Minh Tri, Züfle, Andreas, Wessels, Konrad
Multiresolution deep learning approaches, such as the U-Net architecture, have achieved high performance in classifying and segmenting images. However, these approaches do not provide a latent image representation and cannot be used to decompose, denoise, and reconstruct image data. The U-Net and other convolutional neural network (CNN) architectures commonly use pooling to enlarge the receptive field, which usually results in irreversible information loss. This study proposes to include a Riesz-Quincunx (RQ) wavelet transform, which combines 1) higher-order Riesz wavelet transform and 2) orthogonal Quincunx wavelets (which have both been used to reduce blur in medical images) inside the U-Net architecture, to reduce noise in satellite images and their time-series. In the transformed feature space, we propose a variational approach to understand how random perturbations of the features affect the image to further reduce noise. Combining both approaches, we introduce a hybrid RQUNet-VAE scheme for image and time series decomposition used to reduce noise in satellite imagery. We present qualitative and quantitative experimental results that demonstrate that our proposed RQUNet-VAE was more effective at reducing noise in satellite imagery compared to other state-of-the-art methods. We also apply our scheme to several applications for multi-band satellite images, including: image denoising, image and time-series decomposition by diffusion and image segmentation.
- North America > United States > Virginia (0.14)
- North America > United States > Maryland (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
The power of deeper networks for expressing natural functions
Deep learning has lately been shown to be a very powerful tool for a wide range of problems, from image segmentation to machine translation. Despite its success, many of the techniques developed by practitioners of artificial neural networks (ANNs) are heuristics without theoretical guarantees. Perhaps most notably, the power of feedforward networks with many layers (deep networks) has not been fully explained. The goal of this paper is to shed more light on this question and to suggest heuristics for how deep is deep enough. It is well-known [1-3] that nonlinear neural networks with a single hidden layer can approximate any function under reasonable assumptions, but it is possible that the networks required will be extremely large. Recent authors have shown that some functions can be approximated by deeper networks much more efficiently (i.e. with fewer neurons) than by shallower ones.
Complete Dictionary Recovery over the Sphere I: Overview and the Geometric Picture
Sun, Ju, Qu, Qing, Wright, John
We consider the problem of recovering a complete (i.e., square and invertible) matrix $\mathbf A_0$, from $\mathbf Y \in \mathbb{R}^{n \times p}$ with $\mathbf Y = \mathbf A_0 \mathbf X_0$, provided $\mathbf X_0$ is sufficiently sparse. This recovery problem is central to theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals and finds numerous applications in modern signal processing and machine learning. We give the first efficient algorithm that provably recovers $\mathbf A_0$ when $\mathbf X_0$ has $O(n)$ nonzeros per column, under a suitable probability model for $\mathbf X_0$. In contrast, prior results based on efficient algorithms either only guarantee recovery when $\mathbf X_0$ has $O(\sqrt{n})$ nonzeros per column, or require multiple rounds of SDP relaxation to work when $\mathbf X_0$ has $O(n^{1-\delta})$ nonzeros per column (for any constant $\delta \in (0, 1)$). Our algorithmic pipeline centers around solving a certain nonconvex optimization problem with a spherical constraint. In this paper, we provide a geometric characterization of the objective landscape. In particular, we show that the problem is highly structured: with high probability, (1) there are no "spurious" local minimizers; and (2) around all saddle points the objective has a negative directional curvature. This distinctive structure makes the problem amenable to efficient optimization algorithms. In a companion paper (arXiv:1511.04777), we design a second-order trust-region algorithm over the sphere that provably converges to a local minimizer from arbitrary initializations, despite the presence of saddle points.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
Evolutionary Inference for Function-valued Traits: Gaussian Process Regression on Phylogenies
Jones, Nick S., Moriarty, John
In this paper we consider statistical inference for function-valued data which are correlated due to phylogenetic relationships. A schematic example is given in Figure 1A: in this case, given functional data observed at the tips of a phylogeny, the task is to perform inference on the (unobserved) functional data at the root of the phylogeny. Alternatively, if the phylogeny is uncertain we may wish to perform phylogenetic inference, or our interest may be inferring the dynamics of the evolutionary process which produced the data. The term 'function-valued' is meant in the sense of [1], where a datum is a continuous function f(x) of a variable x, such as time or temperature: examples are therefore curves for ambient temperature versus growth rate for caterpillars, a heart rhythm time series [2], or a spectrogram of audio data. Our approach is to combine the theory of Gaussian processes with assumptions from phylogenetics, to obtain a flexible nonparametric model for such data.